jitSim: A Simulator for Predicting Scalability of Parallel Applications in Presence of OS Jitter

نویسندگان

  • Pradipta De
  • Vijay Mann
چکیده

Traditionally, Operating system jitter has been a source of performance degradation for parallel applications running on large number of processors. While some large scale HPC systems such as Blue Gene/L and Cray XT4, mitigate jitter by making use of a specialized light-weight operating system on compute nodes, other clusters have attempted using HPC-ready commodity operating systems such as ZeptoOS (based on Linux). However, as large systems continue to be designed to work with commodity OSes, OS jitter still remains an active area of research within the HPC community. While, it is true that some of the specialized commodity OSes like ZeptoOS have relatively low OS jitter levels, there is still a need to have a quick and easy set of tools that can predict the impact of OS jitter at a given configuration and processor number. Such tools are also required to validate and compare any new techniques or OS enhancements that mitigate jitter. Emulating jitter on a large ”jitter-free” platform using either synthetic jitter or real traces from commodity OSes has been proposed as one useful mechanism to study scalability behavior under the presence of jitter. However, this requires access to large scale jitter free systems, which are few in number and not so easily accessible. As new systems are built, that should scale up to a million tasks and more, the emulation approach is still limited by the largest jitter free system available. In this paper we present jitSim a simulation framework for predicting scalability of parallel compute intensive applications in presence of OS jitter using trace driven simulation. The jitter simulation framework can be used to quickly simulate the effects of jitter that is characteristic of a given OS using a given trace. Furthermore, this system can be used to predict scalability up to any arbitrarily large number of task counts. Our methodology comprises of collection of real jitter traces, measurement of network latency, message passing stack latency, and shared memory latency. The simulation framework takes the above as inputs and then simulates multiple parallel tasks starting at randomly chosen points in the jitter trace and executing a compute phase. We validate the simulation results by comparing it with real data and demonstrate the efficacy of the simulation framework by evaluating various jitter mitigation techniques through simulation.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Performance Prediction and Evaluation

In recent years a range of novel methodologies and tools have been developed for the purpose of evaluation, design, and model reduction of existing and emerging parallel and distributed systems. At the same time, the coverage of the term “performance” has broadened to include reliability, robustness, energy consumption, and scalability, in addition that is to the classical performance-oriented ...

متن کامل

A Simulation-Based Scalability Study of Parallel Systems

Scalability studies of parallel architectures have used scalar metrics to evaluate their performance. Very often, it is difficult to glean the sources of inefficiency resulting from the mismatch between the algorithmic and architectural requirements using such scalar metrics. Low-level performance studies of the hardware are also inadequate for predicting the scalability of the machine on real ...

متن کامل

Dual Phase Detector Based Delay Locked Loop for High Speed Applications

In this paper a new architecture for delay locked loops will be presented.  One of problems in phase-frequency detectors (PFD) is static phase offset or reset path delay. The proposed structure decreases the jitter resulted from PFD by switching two PFDs. In this new architecture, a conventional PFD is used before locking of DLL to decrease the amount of phase difference between input and outpu...

متن کامل

Design and Kinematic Analysis of a 4-DOF Serial-Parallel Manipulator for a Driving Simulator

This paper presents the kinematic analysis and the development of a 4-degree-of-freedom serial-parallel mechanism for large commercial vehicle driving simulators. The degrees of freedom are selected according to the target maneuvers and the structure of human motion perception organs. Several kinematic properties of parallel part of the mechanism under study are investigated, including the inve...

متن کامل

Software of the Earth Simulator

The Earth Simulator (ES) is a distributed-memory parallel system with a peak performance of 40 TFLOPS. The system consists of 640 nodes connected via a fast 640 × 640 single-stage crossbar network. In this paper, an overview of software of the Earth Simulator (the operating system, Operation Management Software and a programming environment) is presented. The operating system of ES (ES OS) is b...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2010